---
title: GenAI feature considerations
description: Learn to create and apply your own text data for use by an LLM in GenAI.
section_name: Generative AI
maturity: public-preview
platform: cloud-only

---

# GenAI feature considerations {: #genai-feature-considerations }

When working with generative AI capabilities in DataRobot, consider the following. Note that as the product continues to develop, some considerations may change.

**Trial users**: See the considerations specific to the [DataRobot free trial](#trial-user-considerations), including [supported LLM base models](#llm-availability).

## General considerations {: #general-considerations }

* Fewer embedding models are supported through the UI than through the API.

* If a multilingual dataset exceeds the limit associated with the multilingual model, DataRobot defaults to using the `jinaai/jina-embedding-t-en-v1` embedding model.

* There is no support for adding external/custom vector databases or custom LLMs through the UI.

* When using LLMs, be aware of the vendor's model versioning and end-of-life schedules. As a best practice, use only endpoints that are generally available when deploying to production.

* Previous chat history is used as context only when chatting with a single LLM blueprint in the playground. Comparison prompts and prompts submitted to custom models deployed from the playground do not include previous prompts (history) as context.


### LLM availability {: #llm-availability }

The following table describes the availability of LLMs:

Type | US cluster | EU cluster
---- | ---------- | ----------
Azure OpenAI GPT-4 | ✔ |  ✔
Azure OpenAI GPT-4 32k  | ✔ |  ✔
Azure OpenAI GPT-3.5 Turbo 16k | ✔ |  ✔
Azure OpenAI GPT-3.5 Turbo&ast;  | ✔ |  ✔
Google Bison&ast; |  ✔ |  ✔
Amazon Titan&ast; |  ✔ |  ✔

&ast; Available for trial users, cluster-dependent.


## Playground considerations {: #playground-considerations }

* Playground sharing is not supported; each user collaborating in a Use Case will see only the playgrounds they have created.


## Vector database considerations {: #vector-database-considerations }

The following sections describe considerations related to [vector databases](vector-dbs):

* [Supported dataset types](#supported-dataset-types)
* [Dataset limits](#dataset-limits)

### Supported dataset types {: #supported-dataset-types }

{% include 'includes/genai/genai-zip-include.md' %}

Regarding file types, DataRobot provides the following support:

* `.txt` documents

* PDF documents
    * Text-based PDFs are supported.
    * Image-based PDFs are not fully supported: images are generally ignored but do not cause errors.
    * Documents with mixed image and text content are supported; only the text is parsed.
    * A single document consisting only of images results in an empty document and is ignored.
    * Datasets consisting entirely of image-only documents (no text) cannot be processed.

* Mixed PDF and `.txt` documents in a single dataset are supported.
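
The file-type rules above can be checked before upload. The following is a minimal sketch, assuming the dataset is packaged as a ZIP archive and that only the two file types named above (`.txt` and PDF) are accepted. The function name `unsupported_members` is hypothetical and is not part of any DataRobot API.

```python
import zipfile
from pathlib import PurePosixPath

# Assumption: only the file types named in this section are accepted.
SUPPORTED_SUFFIXES = {".txt", ".pdf"}


def unsupported_members(archive_file):
    """Return the names of archive members whose extension is not supported.

    `archive_file` is a path or a binary file-like object containing a ZIP archive.
    """
    with zipfile.ZipFile(archive_file) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if not info.is_dir()
            and PurePosixPath(info.filename).suffix.lower() not in SUPPORTED_SUFFIXES
        ]
```

This check looks only at file extensions. It cannot tell whether a PDF is text-based or image-only, so an archive that passes may still yield empty documents after text extraction.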

### Dataset limits {: #dataset-limits }

The global 1GB dataset limit is applied during vector database creation, after the text is extracted from the documents. Additional dynamic limits, which depend on the embedding model, are listed below:

* `jinaai/jina-embedding-t-en-v1`: Supported up to the 1GB global limit
* `sentence-transformers/all-MiniLM-L6-v2`: Supported up to 650MB
* `Multilingual-e5-base`: Supported up to 250MB
* `E5-base-v2`: Supported up to 250MB
* `E5-large-v2`: Supported up to 100MB
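
As an illustration of how these limits interact, the sketch below estimates the size of already-extracted text and lists the embedding models whose limit it fits. The limits are copied from the list above. The helper names, the use of UTF-8 byte length as the size measure, and the choice of decimal megabytes (1GB = 1000MB) are assumptions, not documented DataRobot behavior.

```python
# Limits from the list above, in megabytes.
# Assumption: 1GB is treated as 1000MB.
EMBEDDING_LIMITS_MB = {
    "jinaai/jina-embedding-t-en-v1": 1000,
    "sentence-transformers/all-MiniLM-L6-v2": 650,
    "Multilingual-e5-base": 250,
    "E5-base-v2": 250,
    "E5-large-v2": 100,
}

BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def extracted_size_mb(texts):
    """Return the total UTF-8 size of the extracted texts, in megabytes."""
    return sum(len(text.encode("utf-8")) for text in texts) / BYTES_PER_MB


def models_within_limit(size_mb):
    """Return the embedding models whose limit accommodates `size_mb`."""
    return [name for name, limit in EMBEDDING_LIMITS_MB.items() if size_mb <= limit]
```

For example, `models_within_limit(300)` returns only the `jinaai` and `sentence-transformers` models, because 300MB exceeds the limits of the three E5 models.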

## Playground deployment considerations {: #playground-deployment-considerations }

Consider the following when registering and deploying LLMs from the playground:

* Setting API keys through the DataRobot credential management system is supported. Those credentials are accessed as environment variables in a deployment.

* Registration and deployment are supported for:

    * All base LLMs in the playground

    * LLMs with vector databases

* Registration and deployment are _not_ supported for draft blueprints.

* Creating a custom model version from an LLM blueprint associated with a large vector database (500+ MB) can take a while. You can leave the model workshop while the model is being created.
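
Because credentials are exposed to a deployment as environment variables, code running in the deployment can read them with the standard library. The following is a minimal sketch. The variable name `OPENAI_API_KEY` and the helper `load_api_key` are hypothetical; the actual variable name depends on how the credential is configured.

```python
import os


def load_api_key(var_name="OPENAI_API_KEY", environ=None):
    """Read an API key from the environment and fail early if it is missing.

    `var_name` is a hypothetical name; use the name configured for your credential.
    `environ` defaults to os.environ and can be replaced with a dict for testing.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(var_name)
    if not key:
        raise RuntimeError(f"Environment variable {var_name!r} is not set")
    return key
```

Failing at startup with a clear message is usually easier to diagnose than an authentication error returned later by the LLM vendor.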

## Trial user considerations {: #trial-user-considerations }

The following considerations apply only to DataRobot free trial users:

* You can create up to 15 vector databases in total, counted across all of your Use Cases. Deleted vector databases count toward this limit.

* You can make up to 300 LLM API calls. Only successful prompt-response pairs count toward this limit, and deleting a prompt or response does not restore the call it used.

* “Bring-your-own” LLMs and vector databases are not available.

See also the section on [LLM availability](#llm-availability).
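
The counting rule for trial LLM API calls can be illustrated with a small sketch. The `TrialQuota` class is hypothetical and only models the rule described above (successful prompt-response pairs are counted, and deletions do not restore quota). It is not how DataRobot implements the accounting.

```python
class TrialQuota:
    """Toy model of the trial LLM call limit described above."""

    def __init__(self, limit=300):
        self.limit = limit
        self.used = 0

    def record(self, succeeded):
        """Count a prompt-response pair only if it succeeded."""
        if succeeded:
            self.used += 1

    def delete_prompt(self):
        """Deleting a prompt does not restore quota, so the count is unchanged."""

    @property
    def remaining(self):
        """Calls left before the limit is reached (never negative)."""
        return max(self.limit - self.used, 0)
```

After one successful call, one failed call, and one deletion, `remaining` is 299.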
